[57] ABSTRACT

This is a fully parallel analog backpropagation learning 
processor which comprises a plurality of programmable 
resistive memory elements serving as synapse connec- 
tions whose values can be weighted during learning 
with buffer amplifiers, summing circuits, and sample- 
and-hold circuits arranged in a plurality of neuron lay- 
ers in accordance with delta-backpropagation algo- 
rithms modified so as to control weight changes due to 
circuit drift. 

14 Claims, 4 Drawing Sheets 
ORIGIN OF THE INVENTION

The invention described herein was made in the per- 
formance of work under a NASA contract, and is sub- 
ject to the provisions of Public Law 96-517 (35 USC 
202) in which the Contractor has elected not to retain
title. 

TECHNICAL FIELD 

The invention relates to neural computing networks
and, more particularly, in a neural computing network
comprising at least three layers extending between at
least one input at a first layer and at least one output at
a third layer and comprising a plurality of neuron
computational nodes interconnected by a plurality of
synapse connections providing adjustably weighted paths
between the neuron computational nodes, to the
method of connection and operation for teaching the
network comprising the steps of, prior to teaching,
including at least one programmable resistive memory
element representing the weight to be accorded the
associated synapse connection in each of the synapse
connections and, at the time of teaching, applying a
known input value to the input of the neural computing
network, calculating error signals representing the
difference between an actual output value from each
neuron computational node having a synapse connection as
an input thereto and a target output value in response to
the known input value applied to the input of the neural
computing network, and adjusting the programmable
resistive memory element of each of the synapse
connections according to a modified delta-backpropagation
algorithm as a function of an associated one of the error
signals wherein backpropagation is serially performed
one layer at a time with the final layer being adjusted
first.

In the preferred embodiment, the step of adjusting
the programmable resistive memory element of each of
the synapse connections according to a modified
delta-backpropagation algorithm comprises the steps of
performing feedforward on one element of a training set;
calculating an error signal which is the difference
between an actual output and a desired target value;
multiplying the error signal by a feedforward activation
value weighted by the derivative of an output activation
function; for each programmable resistive memory
element that is connected to an output neuron
computational node, multiplying the error signal with the
output of the previous layer neuron computational node; and,
using the resultant product from the preceding step to
adjust the weight represented by the associated
programmable resistive memory element by a small
increment.

In one embodiment, the error signal is backpropagated
through the output layer and through a second
layer and the method additionally comprises the steps
of, after said step of performing feedforward on one
element of a training set, storing the error signal in a
sample and hold apparatus and then physically switching
the output layer synapse connections from a
feedforward position to a backpropagation network to thus
disable feedforward operation.

Also in the preferred embodiment, the method addi- 
tionally comprises the step of physically switching the 
synapse connections between a first or “top” position
wherein each synapse connection is in a feedforward
position, a second or “center” position wherein the
synapse connection is reprogrammed based on the
backpropagated error signal and the activation of the
previous layer neuron, and a third or “bottom” position
wherein the synapse connections are used to form the
backpropagated error signal required to reprogram synapse
connections in earlier layers. Additionally, there is the
step of applying a threshold to a backpropagated error 
signal and not updating the programmable resistive 
memory element of each of the synapse connections if 
the backpropagated error signal is below the threshold. 
Preferably, the threshold is set to a maximum expected 
offset error whereby uncontrolled weight inflation asso- 
ciated with the programmable resistive memory ele- 
ments cannot occur. The preferred method also in- 
cludes the step of adjusting each of the programmable 
resistive memory elements an amount representing a 
very small fixed weight change in a direction that tends 
to cause a change of state in the neuron computational 
node following a programmable resistive memory ele- 
ment when the backpropagated error signal is below the 
threshold whereby learning precision of the neural 
computing network is improved. 

The preferred method also includes the steps of, prior 
to teaching, including at least two programmable resis- 
tive memory elements in each of the synapse connec- 
tions representing the weight to be accorded the associ- 
ated synapse connection and, at the time of teaching, 
applying a positive signal through one of the program- 
mable resistive memory elements to represent excitation 
and applying a negative signal through another of the 
programmable resistive memory elements to represent 
inhibition. 

BACKGROUND ART 

In the field of computing hardware used for process- 
ing data in an environment such as fault-tolerant learn- 
ing machines, autonomous control, pattern matching, 
artificial intelligence, robotics, etc., much excitement 
has been generated by neural network models that have 
the capability to learn. One particular model, delta- 
backpropagation (DB), has been successfully taught to 
perform a wide variety of tasks. DB shows promise for 
rapidly performing tasks that traditionally require great 
computational resources (e.g., image processing, pat- 
tern completion, and searching), because its neural net- 
work algorithm consists of many simple processing 
elements all working in parallel rather than one central 
processing element (i.e. the computer) working in a 
serial fashion as occurs in other forms of computing. 

To date, all DB studies have been accomplished in a 
simulated environment using serial computers, or com- 
puters with a limited number of parallel processors. 
Thus DB has not actually been used for such high pow- 
ered applications in normal and everyday use as would 
be desirable. In such simulated environments, digital 
computers are programmed to simulate the neural net- 
work DB algorithm. Consequently, the speed advan- 
tage inherent in the DB model is lost and simulation 
studies may take days or even weeks to run. A fully 
parallel hardware implementation is necessary for de- 
termining the utility of DB in solving large-scale com- 
putationally-intensive problems. Furthermore, the DB 
elements must be implemented in VLSI technology in 
order to make feasible a system with great numbers of
parallel elements. 

A fully parallel hardware implementation may take 
the form of analog circuitry, digital circuitry, or a
hybrid of the two. While a digital implementation may
hold the advantage of higher precision with respect to 
mathematical computations, an analog system may be 
significantly simpler in terms of number of transistors 
and, consequently, more processing elements may fit 
onto a VLSI chip of a given area. The problem then is
how to implement the delta backpropagation algorithm 
in an analog hardware form that lends itself to imple- 
mentation in VLSI. 

Being modeled essentially after the human brain,
neural networks generally consist of a number of simple
processing elements, called “neurons”, that are
connected by conductive elements called “synapses”. The
conductances of the synapses are continuously variable.
Information is stored in these systems by synapse
conductance values. One popular prior art scheme for
connecting neurons and synapses is depicted in FIG. 1. The
network 10 is prompted by applying analog or digital
signals to the input lines 12. This activates the neurons
14 and synapses 16 in the network 10. The degree to
which a given neuron 14 is activated depends on the
activation of the neurons 14 in the previous layer as well
as the conductances of the weights leading to that
neuron 14. After the system has settled, the output nodes 18
give the result. An electrical realization of such a prior
art feedforward system is depicted in FIG. 2. The
synapses 16 can be implemented as resistors and the
neurons 14 as summers and threshold functions connected
in series. Note that this configuration consists of three
layers. The input layer, which can be as simple as a
buffer or even a direct connection, is required to excite
the first layer of synaptic elements. One or more hidden
layers are required if the network 10 is to be capable of
solving certain classes of problems. The output layer is
required to sum the information from the hidden layer
units, and possibly to threshold the resulting signals.
Note that the number of layers, as well as the number of
neurons 14 in each layer, are variables which must be
selected by the neural network designer according to
the task that the network 10 is to carry out.
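The feedforward arrangement of FIG. 2 can be illustrated numerically. The following is only a minimal sketch, assuming a sigmoid as the threshold function and signed weights standing in for synapse conductances; the names threshold, layer_forward, and feedforward are hypothetical and do not appear in the drawings.

```python
import math

def threshold(x):
    """Smooth threshold (sigmoid), assumed here as the neuron's threshold function."""
    return 1.0 / (1.0 + math.exp(-x))

def layer_forward(inputs, conductances):
    """One synaptic layer followed by its neurons.

    conductances[i][j] is the (hypothetical) signed weight from input i to
    neuron j. Each neuron sums its weighted inputs, then thresholds the sum.
    """
    n_out = len(conductances[0])
    sums = [sum(inputs[i] * conductances[i][j] for i in range(len(inputs)))
            for j in range(n_out)]
    return [threshold(s) for s in sums]

def feedforward(inputs, layers):
    """Propagate through successive layers; return the activations of every layer."""
    activations = [list(inputs)]          # the input layer acts as a simple buffer
    for conductances in layers:
        activations.append(layer_forward(activations[-1], conductances))
    return activations

# Three layers: 2 inputs -> 2 hidden neurons -> 1 output neuron.
hidden_w = [[0.5, -0.5], [0.25, 0.75]]
output_w = [[1.0], [-1.0]]
acts = feedforward([1.0, 0.0], [hidden_w, output_w])
```

The list acts holds the buffered inputs, the hidden activations, and the output, which is the information the later backpropagation steps draw upon.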

While such a feedforward system can be used once
the synaptic weights (i.e. conductances) have been set, a
major consideration of neural network design is how to
adjust these weights. A popular method of weight
adjustment is the delta-backpropagation method. In this
method, the network is trained by example. For a
particular task, the network is repeatedly trained by
applying representative input values and simultaneously
applying the associated desired target output values to the
network. A backpropagation system is then used to
modify the weights such that the target output is more
likely to occur given the applied input. Because the
weights cannot be changed greatly during each
backpropagation pass (otherwise previously stored
information may be corrupted), many thousands or even
millions of backpropagation training passes may be
necessary to fully train such a network.

One prior art attempt at solving the problem
addressed by the present invention employed VLSI
capacitive elements for storing weights as voltages. This
approach, of course, has the great disadvantage that the
capacitances tend to discharge with time. Thus, the
circuit has to be kept at a low temperature so as to
minimize charge leakage. Even with such precautions,
however, the charges will dissipate slowly such
that the weights represented by the capacitive charges
need to be regenerated every day or two. This is
intolerable for most applications and, therefore, a more
permanent storage of weighting values is highly desirable,
such as that provided by resistive elements in other
neural network applications. In this regard, however,
perhaps the most serious obstacle to designing a
practical DB processor is the lack of a suitable programmable
resistive memory (PRM). Such memory elements, of
course, are necessary to connect the various processing
nodes of a neural network and store the information
employed in the network. Among the required
characteristics of such a PRM are high resistance, fast
programmability, and non-volatility. While such devices
have yet to go into actual production, work is well
underway at the Jet Propulsion Laboratory (JPL) in
Pasadena, Calif., and other research facilities with
respect to the production of practical PRMs.
PRM elements have been fabricated in prototype form
using thin-film deposition techniques; and, while these
devices are not yet fast enough to be used in a DB
system, the advancements made to date suggest that
memory elements with the required characteristics on a
commercial basis may not be too far off.

STATEMENT OF THE INVENTION 

Accordingly, it is an object of this invention to pro- 
vide an analog implementation of the delta backpropa- 
gation algorithm within a neural network that lends 
itself to implementation in VLSI form. 

It is another object of this invention to provide an
analog implementation of a neural network employing 
delta backpropagation and which includes programma- 
ble resistive memory elements as the storage for the 
weighting values employed therein. 

Other objects and benefits of the invention will be- 
come apparent from the description which follows 
hereinafter when taken with the drawing figures which 
accompany it. 

BRIEF DESCRIPTION OF THE DRAWINGS 

FIG. 1 is a simplified drawing depicting a typical 
feedforward neural network according to the prior art. 

FIG. 2 is a simplified drawing showing an electrical 
implementation of the neural network of FIG. 1. 

FIG. 3 is a simplified drawing depicting a delta-back- 
propagation neural network according to the present 
invention when backpropagating the error signal from 
the last stage. 

FIG. 4 is a simplified drawing depicting a delta-back- 
propagation neural network according to the present 
invention when backpropagating the error signal from 
other than the last stage. 

FIG. 5 is a more detailed circuit description of the 
present invention. 

DETAILED DESCRIPTION OF THE 
INVENTION 

The present invention as now to be described is in- 
tended to perform a modified delta-backpropagation 
(DB) algorithm in an analog, fully parallel manner using 
programmable resistive memory (PRM) elements and 
circuitry implementing DB equations such as those 
given by Rumelhart, Hinton, and Williams (Chapter 8, 
Volume 1, Parallel Distributed Processing, Rumelhart and
McClelland, eds., MIT Press, 1986), a copy of which is
enclosed herewith. With the exception of the PRM
devices, all circuit subsystems can be designed using
standard CMOS VLSI techniques, including such sub- 
systems as summers, thresholding circuits, and switch- 
ing matrices. 

Electronic implementation of a backpropagation
approach to a neural network which could be
implemented in VLSI logic posed an extremely complicated
problem. One method according to the present
invention is depicted in FIGS. 3 and 4. In the neural circuits
depicted therein, the backpropagation is serially
performed one layer at a time. The final synaptic layer is
adjusted first (see FIG. 3). Feedforward is performed
on one element of a training set and an error signal is
calculated at 20 that is the difference between the actual
output and the desired target value. This error signal is
multiplied at 22 by the feedforward activation weighted
by the derivative from 24 of the output activation
function “g”. For each weight that is connected to that
output neuron 14, the error signal is multiplied at 26
with the output of the previous layer neuron 14, and the
resultant product is used to adjust the weight
represented by the PRM 28 by a small increment.
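The output-layer adjustment of FIG. 3 may be sketched in software as follows. This is an illustrative sketch only, assuming that the activation function “g” is a sigmoid (so that its derivative can be formed from the output itself) and that the “small increment” is set by a hypothetical rate parameter; none of the function or parameter names come from the figures.

```python
import math

def g(x):
    """Output activation function, assumed here to be a sigmoid."""
    return 1.0 / (1.0 + math.exp(-x))

def g_prime_from_output(y):
    """Derivative of the sigmoid expressed through its own output y = g(x)."""
    return y * (1.0 - y)

def update_output_layer(weights, prev_out, target, rate=0.1):
    """One modified-delta update of the final synaptic layer (in place).

    weights[i][j] connects previous-layer neuron i to output neuron j.
    Returns the actual outputs and the derivative-weighted error signals.
    """
    n_in, n_out = len(prev_out), len(target)
    actual = [g(sum(prev_out[i] * weights[i][j] for i in range(n_in)))
              for j in range(n_out)]
    # Error signal (element 20): desired target minus actual output.
    error = [target[j] - actual[j] for j in range(n_out)]
    # Weighting by the derivative of g (elements 22 and 24).
    delta = [error[j] * g_prime_from_output(actual[j]) for j in range(n_out)]
    # Multiply by the previous-layer output (element 26) and adjust each
    # weight (PRM 28) by a small increment.
    for i in range(n_in):
        for j in range(n_out):
            weights[i][j] += rate * delta[j] * prev_out[i]
    return actual, delta

weights = [[0.0], [0.0]]
actual, delta = update_output_layer(weights, prev_out=[1.0, 0.5], target=[1.0])
```

With all weights initially zero the output is 0.5, the error is 0.5, and the derivative-weighted error is 0.125, so the two weights move by 0.0125 and 0.00625 respectively.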

The backpropagation scheme for updating earlier
synaptic layers is shown in FIG. 4. The error signal is
backpropagated through the output synaptic layer and
through the second synaptic layer. Because this is likely
to require that the output layer synapses be physically
switched from the feedforward position to the
backpropagation network, thus disabling the feedforward,
the error signal (or output) must be stored (e.g., using a
sample/hold or A/D and D/A converters at 30). The
weight update then follows a form similar to that of the
output layer; i.e., the backpropagated error for a
specific synapse is multiplied by the activation of the
neuron driving (in feedforward) that synapse, and this
product is used to adjust the weight. Note that this
procedure is recursive; that is, any number of hidden
layers may be implemented by extending the foregoing
scheme to more layers.
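The recursive step of FIG. 4 may be sketched as follows. The sketch assumes sigmoid hidden neurons and represents the sample/hold at 30 by a plain list, held_delta, carrying the stored output-layer error; the function names and the rate parameter are hypothetical.

```python
def backpropagate_delta(held_delta, out_weights, hidden_out):
    """Form the hidden-layer error from the stored output-layer error.

    out_weights[i][j] connects hidden neuron i to output neuron j; the same
    synapses are read "backwards" to sum the held error into each hidden node.
    The factor h * (1 - h) is the sigmoid derivative (an assumption).
    """
    deltas = []
    for i, h in enumerate(hidden_out):
        back_sum = sum(out_weights[i][j] * held_delta[j]
                       for j in range(len(held_delta)))
        deltas.append(h * (1.0 - h) * back_sum)
    return deltas

def update_layer(weights, driving_act, delta, rate=0.1):
    """Adjust each weight by error times the activation of its driving neuron."""
    for i in range(len(driving_act)):
        for j in range(len(delta)):
            weights[i][j] += rate * delta[j] * driving_act[i]

held_delta = [0.125]                    # error stored while feedforward is disabled
out_weights = [[1.0], [-2.0]]           # 2 hidden neurons -> 1 output neuron
hidden_out = [0.5, 0.5]
hidden_delta = backpropagate_delta(held_delta, out_weights, hidden_out)

hidden_w = [[0.0, 0.0]]                 # 1 input -> 2 hidden neurons
update_layer(hidden_w, driving_act=[1.0], delta=hidden_delta)
```

Applying backpropagate_delta again with hidden_delta as the held error would extend the same scheme to a further, earlier layer.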

A more detailed circuit description of the present
invention is shown in FIG. 5. For simplicity, only one
neuron 14 and synapse is shown at each layer, for a total
of three neurons 14 (the input buffer being counted as a
neuron 14) and two synapses 16. Several points should
be noted here. First, since the synaptic connection can
be negative or positive (corresponding to inhibition or
excitation), a differential scheme using two
conductances is required (one may be fixed). These
conductances are driven by signals of opposite polarity, so the
ratio of the two conductances determines whether the
synapses 16 will inhibit or excite. Second, the synapses
16 in this figure are shown in three different
orientations. It should be stressed that the synapses 16 must be
physically switched (by switching means well known to
those skilled in the art which are not included in the
drawings for the sake of simplicity) from one orientation to
another; that is, the same synapse positions (i.e. pair of
programmable resistors) are used in each of the three (or
two in the case of the input layer) positions. In a first or
“top” position, each synapse 16 is in the feedforward
position. In a second or “center” position, the synapse
16 is reprogrammed based on the backpropagated error
signal and the activation of the previous layer neuron
14. (Note, depending on synaptic structure, it may be
possible to reprogram synapses 16 while they are
connected in the feedforward circuit.) In a third or
“bottom” position, the synapses 16 are used to form the
backpropagated error signal required to reprogram
synapses 16 in earlier layers.
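The differential two-conductance scheme can be illustrated with a short calculation. This is a sketch under the assumption of ideal conductances feeding a summing node; the function names and the conductance values are hypothetical and chosen only for illustration.

```python
def synapse_current(v, g_excite, g_inhibit):
    """Current delivered to the summing node by one differential synapse.

    The two conductances are driven by signals of opposite polarity (+v and -v),
    so their contributions subtract at the summing node.
    """
    return v * g_excite + (-v) * g_inhibit

def effective_weight(g_excite, g_inhibit):
    """Signed weight seen by the following neuron."""
    return g_excite - g_inhibit

G_FIXED = 1.0e-6                                   # fixed inhibitory conductance (siemens)
exciting = synapse_current(1.0, 2.0e-6, G_FIXED)   # ratio above one: excitation
inhibiting = synapse_current(1.0, 0.5e-6, G_FIXED) # ratio below one: inhibition
neutral = synapse_current(1.0, G_FIXED, G_FIXED)   # equal conductances cancel
```

Holding one conductance fixed and programming only the other is sufficient to reach both signs of the effective weight, consistent with the remark that one of the two conductances may be fixed.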

As those skilled in the art will readily know and ap- 
preciate, a disadvantage of analog circuit components is 
their tendency to generate small error (offset) voltages. 
In a computer simulation by the inventor herein, it was
found that certain portions of the backpropagation cir- 
cuits are sensitive to offset voltage drift. In particular, 
offset voltages can cause uncontrolled inflation of the 
weight values. Thus, it is important (and preferred) that 
the basic backpropagation algorithm be modified to 
take offset voltages into account. In the present inven- 
tion, this is accomplished by applying a threshold to the 
backpropagated error signal such that the weights are 
not updated if the error voltage is below the threshold. 
If the threshold is set to the maximum expected offset 
voltage error, then uncontrolled weight inflation cannot
occur; however, to ensure that the system learns
properly in all cases, it may also be necessary to alter 
the weights slightly when the error is below threshold. 
Simulations show that learning precision is improved by 
a very small fixed weight change in the direction that 
tends to cause a change of state in the neuron 14 follow- 
ing the weight. 
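The thresholded update may be sketched as follows. Several points in the sketch are assumptions rather than statements of the circuit: the threshold is compared against the magnitude of the error signal, the neuron's "state" is judged against a midpoint of 0.5, and the names thresholded_update, rate, nudge, and midpoint are hypothetical.

```python
def thresholded_update(weight, error_signal, driving_act, neuron_out,
                       threshold, rate=0.1, nudge=1e-4, midpoint=0.5):
    """Return the new weight after one thresholded update.

    At or above threshold: ordinary update, error times driving activation.
    Below threshold: the error may be nothing more than offset voltage, so it
    is ignored and the weight is moved by a very small fixed amount in the
    direction that tends to change the state of the following neuron.
    """
    if abs(error_signal) >= threshold:
        return weight + rate * error_signal * driving_act
    if driving_act == 0.0:
        return weight                      # this synapse cannot influence the neuron
    toward_flip = -1.0 if neuron_out > midpoint else 1.0
    sign_act = 1.0 if driving_act > 0 else -1.0
    return weight + nudge * toward_flip * sign_act

MAX_OFFSET = 0.05   # threshold set to the maximum expected offset error

large = thresholded_update(0.2, 0.5, 1.0, 0.7, threshold=MAX_OFFSET)
small = thresholded_update(0.2, 0.01, 1.0, 0.7, threshold=MAX_OFFSET)
```

In this example the large error produces the ordinary update to 0.25, while the sub-threshold error moves the weight only by the fixed amount, to 0.1999, regardless of how large the offset within the threshold happens to be.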

It should be noted that the synapse programming 
circuit presented hereinbefore is by way of example and 
illustration of the present invention only and may differ 
from actual circuit designs as they will be dependent 
upon the synapse structure actually utilized in the final 
design. 

Wherefore, having thus described my invention, 
what is claimed is: 

1. In a neural computing network extending between 
at least one input on one end and at least one output on 
another end and comprising a plurality of neuron com- 
putational nodes interconnected by a plurality of syn- 
apse connections providing adjustably weighted paths 
between the neuron computational nodes, the improve- 
ment comprising: 

each of the synapse connections including at least one 
programmable resistive memory element repre- 
senting the weight to be accorded the associated 
synapse connection; 

circuit means for implementing a modified delta- 
backpropagation algorithm and for adjusting said 
programmable resistive memory element of each of 
the synapse connections according to said algo- 
rithm as a function of an error signal representing 
the difference between an actual output value from 
the neuron computational node having the synapse 
connection as an input thereto and a target output 
value in response to a known input value applied to 
the input of the neural computing network; and 

means for comparing a threshold relating to resis- 
tance drift in said programmable resistive memory 
element with the error signal so that said program- 
mable resistive memory element of each of the 
synapse connections is not updated as a function of 
said error signal if said backpropagated error signal 
is below said threshold. 

2. The improvement to the neural computing net- 
work of claim 1 and additionally comprising: 

means for setting said threshold to a maximum ex- 
pected offset error whereby uncontrolled weight 
inflation associated with said programmable resis- 
tive memory elements cannot occur. 

3. The improvement to the neural computing net- 
work of claim 2 and additionally comprising: 



means for adjusting each of said programmable resis- 
tive memory elements by an amount representing a 
very small fixed weight change in a direction that 
tends to cause a change of state in the neuron com- 
putational node following said programmable resis- 
tive memory element in response to said back- 
propagated error signal falling below said thresh- 
old whereby learning precision of the neural com- 
puting network is improved. 

4. The improvement to the neural computing network
of claim 1 and additionally comprising:

a) each of the synapse connections including at least 
two programmable resistive memory elements rep- 
resenting the weight to be accorded the associated 
synapse connection; 

b) means for applying an electrically positive signal 
through one of said programmable resistive mem- 
ory elements representing excitation; and, 

c) means for applying an electrically negative signal 
through another of said programmable resistive 
memory elements representing inhibition. 

5. In a neural computing network extending between 
at least one input on one end and at least one output on 
another end and comprising a plurality of neuron com- 
putation nodes interconnected by a plurality of synapse 
connections providing adjustably weighted paths be- 
tween the neuron computation nodes, the method of 
connection and operation for teaching the network 
comprising the steps of: 

a) prior to teaching, including at least one program- 
mable resistive memory element representing the 
weight to be accorded the associated synapse con- 
nection in each of the synapse connections; 

b) at the time of teaching,

b1) applying a known input value to the input of
the neural computing network, 
b2) calculating error signals representing the differ- 
ence between an actual output value from each 
neuron computational node having a synapse 
connection as an input thereto and a target out- 
put value in response to the known input value 
applied to the input of the neural computing 
network, 

b3) using a modified delta-backpropagation algo- 
rithm to adjust the programmable resistive mem- 
ory element of each of the synapse connections 
according to the algorithm as a function of an 
associated one of the error signals; and 

c) comparing a threshold relating to resistance drift in 
said programmable resistive memory element with 
a backpropagated error signal and not updating the 
programmable resistive memory element of each of 
the synapse connections as a function of said error 
signal if said backpropagated error signal is below 
said threshold. 

6. The method of claim 5 and additionally comprising 
the step of: 

setting the threshold to a maximum expected offset 
error whereby uncontrolled weight inflation asso- 
ciated with the programmable resistive memory 
elements cannot occur. 

7. The method of claim 6 and additionally comprising 
the step of: 

adjusting each of said programmable resistive mem- 
ory elements by an amount representing a very 
small fixed weight change in a direction that tends 
to cause a change of state in the neuron computa- 
tional node following said programmable resistive 
memory element in response to said backpropagated
error signal falling below said threshold
whereby learning precision of the neural comput- 
ing network is improved. 

8. The method of claim 5 and additionally comprising 
the steps of: 

a) prior to teaching, connecting at least two program- 
mable resistive memory elements to each of the 
synapse connections representing the weight to be 
accorded the associated synapse connection; 

b) at the time of teaching, 

b1) applying an electrically positive signal through
one of the programmable resistive memory ele- 
ments to represent excitation, and 
b2) applying an electrically negative signal through 
another of the programmable resistive memory 
elements to represent inhibition. 

9. In a neural computing network comprising at least 
three layers extending between at least one input at a 
first layer and at least one output at a third layer, each 
of said layers comprising a plurality of neuron computa- 
tional nodes, different layers being interconnected by a 
plurality of synapse connections providing adjustably 
weighted paths between the neuron computational 
nodes thereof, wherein the error signal is backpropa- 
gated through the output layer and through a second 
layer, the method of connection and operation for 
teaching the network comprising the steps of: 

a) prior to teaching, connecting at least one program- 
mable resistive memory element representing the 
weight to be accorded the associated synapse con- 
nection to each of the synapse connections; 

b) at the time of teaching, 

b1) applying a known input value to the input of
the neural computing network, 
b2) calculating error signals representing the differ- 
ence between an actual output value from each 
neuron computational node having a synapse 
connection as an input thereto and a target out- 
put value in response to the known input value 
applied to the input of the neural computing 
network, and 

b3) adjusting the programmable resistive memory 
element of each of the synapse connections ac- 
cording to a modified delta-backpropagation 
algorithm as a function of an associated one of 
the error signals wherein backpropagation is 
serially performed one layer at a time with the 
final layer being adjusted first by the following 
steps: 

b3)i. performing feedforward on one element of 
a training set applied as predetermined inputs 
to nodes of said first layer, 
b3)ii. calculating an error signal which is the 
difference between an actual output and a 
desired target value, 

b3)iii. multiplying the error signal by a feedforward
activation value weighted by the derivative
of an output activation function,
b3)iv. for each programmable resistive memory 
element that is connected to an output neuron 
computational node, multiplying the error 
signal with the output of the previous layer 
neuron computational node, 
b3)v. using the resultant product from the previ- 
ous step to adjust the weight represented by 
the associated programmable resistive mem- 
ory element by a small increment, 
b4) after the step of performing feedforward on one
element of a training set, storing the error signal 
in a sample and hold apparatus; and 
b5) then physically switching the output layer synapse
connections from a feedforward position to
a backpropagation network to thus disable
feedforward operation.

10. The method of claim 9 and additionally comprising
the step of:

physically switching the synapse connections between
a first or “top” position wherein each synapse
connection is in a feedforward position, a
second or “center” position wherein the synapse
connection is reprogrammed based on the
backpropagated error signal and the activation of the
previous layer neuron, and a third or “bottom” position
wherein the synapse connections are used to form
the backpropagated error signal required to
reprogram synapse connections in earlier layers.

11. The method of claim 9 and additionally comprising
the step of:

applying a threshold to a backpropagated error signal 
and not updating the programmable resistive mem- 
ory element of each of the synapse connections if 
the backpropagated error signal is below the 
threshold. 

12. The method of claim 11 and additionally comprising
the step of:

setting the threshold to a maximum expected offset
error whereby uncontrolled weight inflation asso- 
ciated with the programmable resistive memory 
elements cannot occur. 

13. The method of claim 12 and additionally comprising
the step of:

adjusting each of said programmable resistive memory
elements by an amount representing a very
small fixed weight change in a direction that tends
to cause a change of state in the neuron computational
node following said programmable resistive
memory element in response to said backpropagated
error signal falling below said threshold
whereby learning precision of the neural computing
network is improved.

14. The method of claim 9 and additionally comprising
the steps of:

a) prior to teaching, connecting at least two programmable
resistive memory elements to each of the
synapse connections representing the weight to be
accorded the associated synapse connection;

b) at the time of teaching, 

b1) applying an electrically positive signal through
one of the programmable resistive memory elements
to represent excitation, and

b2) applying an electrically negative signal through
another of the programmable resistive memory
elements to represent inhibition.